Embedding similarity evaluation: English-French

Levels: 0 Basic Overview | 1 Retrieval | 2 Topic‑Level | 3 Error‑Type
Columns: Model | Model Info | Dist Chart | Rand Chart | Basic Stats | CMC Curve | Retrieval Stats | UMAP (Topics & Lang1) | UMAP (Topics & Lang2) | UMAP (Sent‑Len) | Topic Avg Cosine | Error % Bar Chart
all_mpnet_base_v2
Model Info
Statistic Value
arch mpnet
hidden_size 768
layers 12
vocab_size 30527
Basic Stats
Statistic Value
mean_true 0.562376
std_true 0.230922
mean_random 0.149132
std_random 0.106725
snr 3.872047
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.547000
precision@k 0.109400
mean reciprocal rank 0.540176
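The snr values in the Basic Stats tables are consistent with (mean_true - mean_random) / std_random: for all_mpnet_base_v2, (0.562376 - 0.149132) / 0.106725 ≈ 3.872. The page does not state the formula, so the sketch below is an assumption that happens to reproduce the reported figures. The function names and the use of the population standard deviation are illustrative choices, not the dashboard's own code. A ks_p_value shown as 0.000000 presumably means the p-value is below the display precision, not that it is exactly zero.

```python
from statistics import mean, pstdev


def snr_from_summary(mean_true, mean_random, std_random):
    """Separation of matched pairs from random pairs, in units of the
    random-pair standard deviation (assumed formula, see lead-in)."""
    return (mean_true - mean_random) / std_random


def separation_stats(true_sims, random_sims):
    """Summarise cosine similarities of true translation pairs and of
    randomly paired sentences. Uses population std (an assumption)."""
    mean_true = mean(true_sims)
    mean_random = mean(random_sims)
    std_random = pstdev(random_sims)
    return {
        "mean_true": mean_true,
        "std_true": pstdev(true_sims),
        "mean_random": mean_random,
        "std_random": std_random,
        "snr": snr_from_summary(mean_true, mean_random, std_random),
    }


# Check the assumed formula against the all_mpnet_base_v2 values above.
reported_snr = 3.872047
recomputed = snr_from_summary(0.562376, 0.149132, 0.106725)
print(round(recomputed, 3), abs(recomputed - reported_snr) < 1e-4)
```

Because the denominator is std_random alone, a model can have a high mean_true and still score a low snr when its random pairs are also highly similar.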
all_MiniLM_L6_v2
Model Info
Statistic Value
arch bert
hidden_size 384
layers 6
vocab_size 30522
Basic Stats
Statistic Value
mean_true 0.510811
std_true 0.247928
mean_random 0.111792
std_random 0.098802
snr 4.038573
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.51120
precision@k 0.10224
mean reciprocal rank 0.50709
all_roberta_large_v1
Model Info
Statistic Value
arch roberta
hidden_size 1024
layers 24
vocab_size 50265
Basic Stats
Statistic Value
mean_true 0.590186
std_true 0.213847
mean_random 0.131289
std_random 0.106083
snr 4.325826
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.675850
precision@k 0.135170
mean reciprocal rank 0.645148
paraphrase_mpnet_base_v2
Model Info
Statistic Value
arch mpnet
hidden_size 768
layers 12
vocab_size 30527
Basic Stats
Statistic Value
mean_true 0.513843
std_true 0.242570
mean_random 0.158809
std_random 0.097676
snr 3.634797
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.423650
precision@k 0.084730
mean reciprocal rank 0.445514
paraphrase_MiniLM_L6_v2
Model Info
Statistic Value
arch bert
hidden_size 384
layers 6
vocab_size 30522
Basic Stats
Statistic Value
mean_true 0.486098
std_true 0.249639
mean_random 0.143441
std_random 0.109846
snr 3.119418
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.375400
precision@k 0.075080
mean reciprocal rank 0.409981
bert_base_nli_mean_tokens
Model Info
Statistic Value
arch bert
hidden_size 768
layers 12
vocab_size 30522
Basic Stats
Statistic Value
mean_true 0.658200
std_true 0.197101
mean_random 0.369052
std_random 0.134123
snr 2.155836
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.373550
precision@k 0.074710
mean reciprocal rank 0.418525
LaBSE
Model Info
Statistic Value
arch bert
hidden_size 768
layers 12
vocab_size 501153
Basic Stats
Statistic Value
mean_true 0.880885
std_true 0.097555
mean_random 0.182254
std_random 0.096057
snr 7.273065
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.977600
precision@k 0.195520
mean reciprocal rank 0.969223
distiluse_base_multilingual_cased_v2
Model Info
Statistic Value
arch distilbert
hidden_size 768
layers 6
vocab_size 119547
Basic Stats
Statistic Value
mean_true 0.870821
std_true 0.117403
mean_random 0.024481
std_random 0.083102
snr 10.184350
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.96985
precision@k 0.19397
mean reciprocal rank 0.95929
paraphrase_multilingual_MiniLM_L12_v2
Model Info
Statistic Value
arch bert
hidden_size 384
layers 12
vocab_size 250037
Basic Stats
Statistic Value
mean_true 0.868893
std_true 0.125884
mean_random 0.177900
std_random 0.133930
snr 5.159371
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.935650
precision@k 0.187130
mean reciprocal rank 0.920887
paraphrase_multilingual_mpnet_base_v2
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.902621
std_true 0.106138
mean_random 0.225598
std_random 0.126597
snr 5.347846
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.95175
precision@k 0.19035
mean reciprocal rank 0.93809
paraphrase_xlm_r_multilingual_v1
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.871210
std_true 0.112547
mean_random 0.188367
std_random 0.106660
snr 6.402067
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.961500
precision@k 0.192300
mean reciprocal rank 0.949177
xlm_r_distilroberta_base_paraphrase_v1
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.871210
std_true 0.112547
mean_random 0.188367
std_random 0.106660
snr 6.402067
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.961500
precision@k 0.192300
mean reciprocal rank 0.949177
stsb_xlm_r_multilingual
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.904620
std_true 0.103386
mean_random 0.183683
std_random 0.148406
snr 4.857887
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.949950
precision@k 0.189990
mean reciprocal rank 0.936294
xlm_r_bert_base_nli_stsb_mean_tokens
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.904620
std_true 0.103386
mean_random 0.183683
std_random 0.148406
snr 4.857887
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.949950
precision@k 0.189990
mean reciprocal rank 0.936294
xlm_r_100langs_bert_base_nli_stsb_mean_tokens
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.904620
std_true 0.103386
mean_random 0.183683
std_random 0.148406
snr 4.857887
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.949950
precision@k 0.189990
mean reciprocal rank 0.936294
xlm_r_100langs_bert_base_nli_mean_tokens
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.941180
std_true 0.069567
mean_random 0.361084
std_random 0.159629
snr 3.634023
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.939800
precision@k 0.187960
mean reciprocal rank 0.924258
distilbert_multilingual_nli_stsb_quora_ranking
Model Info
Statistic Value
arch distilbert
hidden_size 768
layers 6
vocab_size 119547
Basic Stats
Statistic Value
mean_true 0.963661
std_true 0.040054
mean_random 0.766971
std_random 0.071925
snr 2.734651
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.901200
precision@k 0.180240
mean reciprocal rank 0.877196
quora_distilbert_multilingual
Model Info
Statistic Value
arch distilbert
hidden_size 768
layers 6
vocab_size 119547
Basic Stats
Statistic Value
mean_true 0.963661
std_true 0.040054
mean_random 0.766971
std_random 0.071925
snr 2.734651
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.901200
precision@k 0.180240
mean reciprocal rank 0.877196
xlm_r_large_en_ko_nli_ststb
Model Info
Statistic Value
arch xlm-roberta
hidden_size 1024
layers 24
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.872944
std_true 0.125564
mean_random 0.191898
std_random 0.145794
snr 4.671275
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.916650
precision@k 0.183330
mean reciprocal rank 0.898344
xlm_r_base_en_ko_nli_ststb
Model Info
Statistic Value
arch xlm-roberta
hidden_size 768
layers 12
vocab_size 250002
Basic Stats
Statistic Value
mean_true 0.858422
std_true 0.124144
mean_random 0.226259
std_random 0.162901
snr 3.880651
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.886750
precision@k 0.177350
mean reciprocal rank 0.864433
clip_ViT_B_32_multilingual_v1
Model Info
Statistic Value
arch distilbert
hidden_size 768
layers 6
vocab_size 119547
Basic Stats
Statistic Value
mean_true 0.965292
std_true 0.029735
mean_random 0.838471
std_random 0.064127
snr 1.977647
ks_p_value 0.000000
Retrieval Stats
Statistic Value
recall@k 0.735700
precision@k 0.147140
mean reciprocal rank 0.734473
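
The Retrieval Stats could be computed along the following lines, assuming each English sentence has exactly one correct French counterpart and candidates are ranked by cosine similarity. The page does not state k or the ranking procedure. The name `retrieval_metrics`, the layout of the `sim` matrix, and the default k=5 are assumptions made for illustration. Under the single-correct-target assumption, precision@k equals recall@k / k, and every table on this page satisfies precision@k = recall@k / 5 (for example 0.547000 / 5 = 0.109400).

```python
def retrieval_metrics(sim, k=5):
    """Compute recall@k, precision@k and mean reciprocal rank.

    sim[i][j] is the similarity between source sentence i and target
    sentence j; the correct target for source i is assumed to be j == i.
    """
    n = len(sim)
    hits = 0
    reciprocal_rank_sum = 0.0
    for i, row in enumerate(sim):
        true_score = row[i]
        # Rank of the true translation: 1 + number of other candidates
        # that score strictly higher.
        rank = 1 + sum(1 for j, s in enumerate(row) if j != i and s > true_score)
        if rank <= k:
            hits += 1
        reciprocal_rank_sum += 1.0 / rank
    recall = hits / n
    return {
        "recall@k": recall,
        # With one relevant item per query, at most 1 of the k results is correct.
        "precision@k": recall / k,
        "mrr": reciprocal_rank_sum / n,
    }


# Toy 3x3 similarity matrix: rows 0 and 2 rank their true pair first,
# row 1 ranks it second (0.8 > 0.5).
toy = [
    [0.9, 0.1, 0.2],
    [0.8, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]
print(retrieval_metrics(toy, k=1))
print(retrieval_metrics(toy, k=2))
```

Ties are broken in favour of the true pair here (only strictly higher scores push it down), which is one of several reasonable conventions.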